FOREWORD



The investigation described here was conducted under grant LM-94 from the Public
Health Service National Library of Medicine.

The project was under the direction of Dr. Susan Artandi, Associate Professor,
Rutgers, The State University, who was also Principal Investigator.

Other Rutgers University personnel participating in various phases of the work 
reported here were: 

Mr. Stanley Baxendale
Associate Professor, Dept. of Computer Sciences

Dr. Edward H. Wolf
Assistant Professor, Statistics Center

Mr. Donald R. King
Associate Professor, Dept. of Computer Sciences

Mrs. Gillian McElroy, Mr. Charles W. Davis, and Mrs. Ellen Altman
Research Assistants

A great deal of assistance and advice was received from Dr. Thomas H. Mott, Jr., 
Dean, Graduate School of Library Service, formerly Director, Center for Computer and 
Information Services and Chairman, Department of Computer Sciences, Rutgers University. 

Extensive and valuable consultation was provided by Mr. Charles T. Meadow, National 
Bureau of Standards, formerly IBM Corporation, and Mr. Donald L. Dimitry, IBM Corporation. 

Preceding this Final Report three progress reports were published. 

Project MEDICO, First Progress Report, by Susan Artandi and Stanley Baxendale,
1968.

This report describes in detail the indexing algorithm and the indexing program. Some
modifications in the program were made later and the modified version is included in the
Third Progress Report.

The effectiveness of weights and links in automatic indexing. Project MEDICO,
Second Progress Report, by Susan Artandi and Edward H. Wolf, 1968.

This second report describes work related to the statistical evaluation of the output of
the indexing algorithm.

Project MEDICO, Third Progress Report, by Susan Artandi and Stanley 
Baxendale, 1969. 

The MEDICO index file, the searching method, and the search program for the
automatically created file are described in this third report, which also includes the
revised indexing program.

Other publications relating to the Project: 

Artandi, S. Automatic indexing of drug information. In Proceedings of the
American Documentation Institute, 30th Annual Meeting, October 1967.
pp. 148-151.

Artandi, S. and Edward H. Wolf. The effectiveness of automatically generated
weights and links in mechanical indexing. American Documentation,
20:198-202, July 1969.

Artandi, S. Automatic indexing of medical literature: The MEDICO system.
Paper presented at the III International Congress of Medical Librarianship,
Amsterdam, The Netherlands, May 1969.

Artandi, S. Computer indexing of medical articles. Journal of Documentation, 
October, 1969. 
ABSTRACT



An automatic indexing method is described in which index tags for documents are
generated by the computer. The computer scans the text of periodical articles and
automatically assigns to them index terms with their respective weights on the basis of
explicitly defined text characteristics. A machine file of document references with their
associated index terms is automatically produced which can be searched on a coordinate basis
for the retrieval of specified drug-related information. A statistical evaluation of the
output of the indexing algorithm and information concerning the system's ability to respond
to specific queries are given.
I. SUMMARY



The broad objective of the investigation was to explore the potential and applicability of 
automatic methods for the indexing of drug-related information appearing in English natural 
language text and to find out what can be learned about automatic indexing in general from the 
experience. More specific objectives were the development, implementation, and evaluation of 
an indexing algorithm which will enable the computer to assign automatically index terms to 
documents. 

In fully automatic indexing the computer takes the place of the human indexer. It is
programmed to scan the natural language text of the document and to assign index tags to it 
on the basis of explicitly defined text characteristics. The human indexer no longer makes a 
separate judgment for each document since in automatic indexing the computer repeatedly 
applies the same algorithm in the indexing of each document. 

Project MEDICO builds on earlier automatic book indexing research done at Rutgers (1, 2, 3).
While chemistry texts were the basis of that work, in Project MEDICO the primary emphasis
was on drug-related information appearing in English language periodical articles published in
the medical literature.

In addition to assigning index terms to documents, the MEDICO algorithm was designed
to compute weights automatically for the various index terms and to establish links between
index terms and modifiers. The output of the indexing program is a machine searchable file
of document references and their associated index tags. Each document record in the file
includes the following: author, title, bibliographic citation, and index terms with their respective
weights and Chemical Abstracts Registry Numbers. Although the variety of words used in
natural language text is taken into consideration in indexing, the output of the indexing program
utilizes a controlled vocabulary to facilitate more convenient and less ambiguous searching.

The MEDICO index file is a direct file on magnetic tape on which index records are 
sequenced according to document accession number. The primary access points of the file
include four hierarchical levels, and generic searches are easily implemented. Boolean
searches provide for the retrieval of highly specific information. Prior to searching, the 
Boolean expressions corresponding to the natural language query are formulated by the human 
searcher. Normalization of the query to make it compatible with the index language is 
accomplished automatically by the computer. Several queries can be processed simultaneously 
and the list of references relating to each query is printed out as a separate unit. 

An important aspect of the project was the statistical evaluation of the output of the 
automatic indexing algorithm. The statistical tests were designed to examine the validity of 
the assumptions which formed the bases of the indexing algorithms, with primary emphasis on 
those developed for the computation of weights and for the generation of links. The tests also 
included a comparison of the output generated from the full text of the document and from the
processing of the abstracts or summaries of the same articles.

A comparison of the weights assigned to terms using the MEDICO and manual
procedures gave the same value for 71 percent of the terms generated by either procedure.
A moderate increase in agreement (78 percent) was observed for terms assigned a weight of
3 by at least one of the methods. Ninety-eight percent of the weights assigned by manual and
machine methods were found either to be in agreement or to differ by a weight of no more than 1.
Further investigation should demonstrate the effectiveness of weighting using two weights instead
of three.

Seventy-two percent of the links generated by the full text scan of the articles in the
MEDICO procedure were relevant. While writing style did not appear to have an effect on
the proportion of agreements on weights for the manual and machine methods, the percentage
of relevant links observed was found to be dependent on the author's writing style. The
proportion of relevant links decreased as the average length of the sentences increased.

A comparison of the index terms generated from full text with those which were
generated from reduced text showed that the proportion of terms indexed from reduced text
is greater for those terms which had high weights in the full text analysis. Eighty-six
percent of terms having a weight of 3, 48 percent of the terms having a weight of 2, and 11
percent of the terms having a weight of 1 in the full text indexing were also generated from
reduced text.

Another aspect of the evaluation was concerned with the system's ability to respond to
specific queries. The relatively small size of the retrieval file, the limited scope of the
subject matter, and the necessity to use artificial questions placed considerable limitations on
this second phase of the evaluation. While performance scores were calculated, they are
not considered satisfactory bases for broad generalizations. The qualitative results indicate
that search strategy is an important factor governing the performance of the system. One of
the strengths of the system is its allowance for a flexible search strategy.

The choice of computer for the project was determined by the availability of the IBM 7040
computer which was part of the installation of the Center for Computer and Information Services
at Rutgers University. The main programs are written in FORTRAN with some subroutines
written in assembly language.
II. THE AUTOMATIC INDEXING METHOD

The Indexing Algorithm 

The automatic indexing algorithm developed in Project MEDICO is based on the
characteristics and the position of strings of characters constituting words in English natural
language text. This general approach has been characteristic of the various experimental
approaches in the field because of limitations imposed by the inadequate understanding of the
relationship between the meaning of text and the words which appear in it. Thus, automatic
indexing algorithms have not dealt successfully with information that is implicit rather than
explicit in text. Taking these limitations into consideration, the problem of defining for the
computer the presence of information that should be indexed must be restated, at least for
the time being, as follows: how to determine text characteristics in terms of the
characteristics of the strings of characters appearing in it which will indicate to the computer the
presence of information to be indexed and will cause the computer to take a particular action.

In the automatic indexing methods which have been developed, such things as the
frequency of occurrence of words, the co-occurrence of words, the relative position of words,
and the pattern of the strings of characters constituting words have formed the bases of some
experimental methods. The MEDICO algorithm uses location and co-occurrence as a basis for
the assignment of modifiers to index terms, relative frequency of occurrence for the
computation of weights, and the characteristics of string patterns and a stored dictionary for
the selection of index terms.

Input to the computer consisted of the text of English language periodical articles in 
machine readable form and of the abstracts or summaries of the same articles. 

A practical limitation was placed on the scope of the experiment by selecting test
documents dealing with a particular drug group, namely anticonvulsants. As a working
definition, drugs were considered anticonvulsants if they were classified as such in the major
drug dictionaries and in the open literature used in the compilation of the dictionary.
For the selection of the test documents the definition used by the National Library of
Medicine was applied, since the documents were selected from the output of a MEDLARS
search on anticonvulsants.

When the computer scans the document it takes into consideration the uncontrolled
"vocabulary" of the text. When creating the index record for the document the computer is
programmed to switch to the controlled vocabulary of the system. Thus, terms included in
the stored dictionary used for the selection of index terms can be differentiated by their
function and their nature. Differentiated according to their function, the dictionary includes
two types of terms, those which are compared with the text to be indexed and those which
appear in the index record.

According to their nature there are the following types of terms: 

1) trade (brand, proprietary) names of individual drugs 

2) chemical names of individual drugs 

3) generic names of individual drugs 

4) names of groups of chemical compounds 

5) names of drug groups according to activity 

6) terms which are other than names of drugs or chemicals. 

The hierarchical relationship which exists between the types of terms included in (1)
to (5) is utilized in the indexing algorithm to provide for automatic generic posting. This
results in an index record which allows access to the information at as many as four
hierarchical levels.



[Figure: the four hierarchical levels of access, with example terms at each level]

Biological activity:        Anticonvulsants
Drug group name:            Barbiturates, Hydantoins
Chemical or generic name:   Amobarbital, Phenobarbital
Trade name:                 Barbamyl, Isomytal

Terms in the dictionary may be single or multiple word terms and their length is not
limited. Compilation of the dictionary for the project required the identification of those drugs
which belong to the group of anticonvulsants and the selection of those non-drug terms judged
to be of indexing value. While a great deal of effort was made to make the list of
anticonvulsants as complete as possible and to establish equivalencies between chemical, generic, and
trade names, no claim is made that the list is all inclusive, since the primary concern of the
project was to demonstrate the feasibility of the indexing method.

The principal functions of the dictionary are: 

1) to select index terms on the basis of explicitly defined text characteristics 

2) to control the index language of the index records 

3) to assign "packages" for inclusion in the index record.

Packages provide for vocabulary control and for access at various hierarchical levels. 
With each dictionary term a standard package is associated consisting of those terms which 
will appear in the index record whenever the particular dictionary term appears at least once 
in the text being indexed. 

The package for dictionary terms which are chemical or generic names of drugs 
usually includes the following: preferred chemical name, preferred generic name, chemical 
group name and the Chemical Abstracts Registry Number. For dictionary terms which are 
trade names the package includes the trade name in addition to the above. For non-drug 
terms the package consists of the preferred synonym or word form. For example, if
pentobarbitone sodium appeared in the document text, the following corresponding package
would be recorded in the index record:

pentobarbital sodium 

sodium 5-ethyl-5-(1-methylbutyl)barbiturate

barbiturates 

57330 

The same package would be generated if sodium 5-ethyl-5-(1-methylbutyl)barbiturate appeared
in the text. If, however, Nembutal appeared in the text, it would be added to the index record
because it is a trade name.

Inclusion of trade names in packages provides very specific information access
points for the retrieval of documents about a particular pharmaceutical product. This
practice also allows the system to act as a data retrieval system, since the tracings in
the output record include the chemical compound which corresponds to the particular trade
name that is indexed.
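
The package mechanism can be illustrated with a minimal sketch in Python. This is not the project's FORTRAN program; the data layout and the names `DICTIONARY`, `PENTOBARBITAL_PACKAGE`, and `index_entries` are assumptions made for illustration. Only the pentobarbital package contents and the Nembutal trade name are taken from the example above.

```python
# Hypothetical in-memory dictionary. Keys are terms as they may appear in text
# (lower-cased); each entry points to the standard package for that term.
PENTOBARBITAL_PACKAGE = (
    "pentobarbital sodium",                         # preferred generic name
    "sodium 5-ethyl-5-(1-methylbutyl)barbiturate",  # preferred chemical name
    "barbiturates",                                 # chemical group name
    "57330",                                        # Chemical Abstracts Registry Number
)

DICTIONARY = {
    "pentobarbitone sodium": {"package": PENTOBARBITAL_PACKAGE, "trade_name": None},
    "sodium 5-ethyl-5-(1-methylbutyl)barbiturate": {
        "package": PENTOBARBITAL_PACKAGE,
        "trade_name": None,
    },
    "nembutal": {"package": PENTOBARBITAL_PACKAGE, "trade_name": "Nembutal"},
}


def index_entries(text_term):
    """Return the controlled-vocabulary terms posted for one text term."""
    entry = DICTIONARY.get(text_term.lower())
    if entry is None:
        return []  # term not in the dictionary: nothing is posted
    terms = list(entry["package"])
    if entry["trade_name"]:
        # Trade names are added to the package so that a specific
        # pharmaceutical product can be retrieved.
        terms.append(entry["trade_name"])
    return terms
```

Under these assumptions, a synonym and the chemical name yield the same four entries, while the trade name yields the same entries plus the trade name itself.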

Weights and Links 

As indicated earlier, the indexing algorithm was designed to compute and to assign
weights to index terms and to generate links between index terms and modifiers automatically.

Weighting means the assignment of a value to a term to indicate the relative importance
of that term in the subject description of the document, in a query, or in a user profile. Thus,
an index term which represents a central theme in the document gets a high weighting and one
which represents only a marginal element in the subject gets a low weighting. Links indicate
a particular connection between terms where the lack of such a link may create ambiguity.
Both weights and links increase the specificity of terms. They are precision devices because,
by increasing the specificity of the term, non-relevant documents are not retrieved.

Research related to weights and links has been largely centered around systems in
which they are assigned by the human indexer on the basis of the indexer's intellectual judgment.
The algorithm developed in Project MEDICO for the automatic assignment of weights is based
on the assumption that the relative frequency of occurrence of terms in text can serve as
an indication of the importance of the subjects the terms represent. Relative, rather than
absolute, frequency is used to compensate for differences in length among articles. The
computer calculates the number of occurrences per thousand text words and converts the
resulting figure into a weight in the following manner: If the frequency of the term per
thousand text words is less than or equal to 1, the term is assigned a weight of 1.
If the frequency of the term per thousand text words is greater than 1 and less than 3, the
term is assigned a weight of 2. Finally, if the frequency of the term per thousand text
words is greater than or equal to 3, a weight of 3 is assigned to it.
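
The weighting rule can be written out as a short Python sketch. The function name `term_weight` and its argument names are assumptions for illustration; the thresholds are those stated above.

```python
def term_weight(occurrences, total_text_words):
    """Convert a term's relative frequency into a weight of 1, 2, or 3.

    The frequency is expressed per thousand text words to compensate for
    differences in article length.
    """
    if total_text_words <= 0:
        raise ValueError("total_text_words must be positive")
    per_thousand = 1000.0 * occurrences / total_text_words
    if per_thousand <= 1:
        return 1
    if per_thousand < 3:
        return 2
    return 3  # frequency of 3 or more per thousand text words
```

For a 2,000-word article, for instance, a term occurring twice has a frequency of exactly 1 per thousand and receives weight 1, a term occurring four times receives weight 2, and a term occurring six times receives weight 3.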

The automatic generation of links is based on the assumption that co-occurrence within
a sentence is a satisfactory indication that the terms belong together within the context of the
document. Because the test documents were medical articles, and because the emphasis was
on drug-related information, links were created primarily between names of drugs or chemicals
and terms which could act as modifiers, such as therapy, toxicity, administration, dosage, etc.

The following are examples of sentences from which valid links between drug names and
modifiers were automatically generated.

"In the review of English literature of the past ten years the author has found no other
reference to the occurrence of hyperglycemia following the administration of diphenylhydantoin
in humans."

"It was therefore estimated that he received a total dosage of at least 800 mg. of
diphenylhydantoin in a 24 hour period or approximately 70 to 80 mg. per kilogram."

The first sentence generated the index term diphenylhydantoin/administration and the
second generated the term diphenylhydantoin/dosage.
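
Link generation by co-occurrence within a sentence can be sketched as follows. This is an illustrative Python sketch, not the project's program: the term sets, the simple sentence splitter, and the name `sentence_links` are assumptions, and multiple-word terms are not handled.

```python
import re

# Hypothetical, very small term lists used only for this illustration.
DRUG_TERMS = {"diphenylhydantoin"}
MODIFIERS = {"administration", "dosage", "therapy", "toxicity"}

SAMPLE = (
    "In the review of English literature of the past ten years the author has "
    "found no other reference to the occurrence of hyperglycemia following the "
    "administration of diphenylhydantoin in humans. "
    "It was therefore estimated that he received a total dosage of at least "
    "800 mg. of diphenylhydantoin in a 24 hour period or approximately 70 to "
    "80 mg. per kilogram."
)


def sentence_links(text):
    """Return drug/modifier links for terms that co-occur in one sentence."""
    links = []
    # Naive splitter: a sentence ends at ., ! or ? followed by whitespace and
    # an upper case letter, so abbreviations such as "mg. of" do not split.
    for sentence in re.split(r'(?<=[.!?])\s+(?=[A-Z"])', text):
        words = set(re.findall(r"[a-z][a-z0-9-]*", sentence.lower()))
        for drug in sorted(DRUG_TERMS & words):
            for modifier in sorted(MODIFIERS & words):
                links.append(f"{drug}/{modifier}")
    return links
```

Applied to `SAMPLE`, which holds the two sentences quoted above, the sketch yields diphenylhydantoin/administration followed by diphenylhydantoin/dosage.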

New Terms and Dictionary Updating 

Inherent in dictionary-based indexing is the risk of missing unknown new information,
information too new to be included in the dictionary. The problem is to devise some method
by which a new anticonvulsant, for example, could be indexed when it is first reported in the
literature. "Catching" of new indexable information is also important from the point of view
of dictionary updating. While it is possible to select new terms to be added to the dictionary
manually, it is desirable to make the process at least semi-automatic.

One approach to the design of algorithms which will enable the computer to find drug 
names in text is to determine some common characteristics which names of drugs have, 
characteristics which will sufficiently distinguish drug names from other terms. Some 
characteristics which have been identified in case of drug names are such things as their 
length; an alternating string pattern of numbers, letters, and dashes; the capitalization of 
registered trade names followed by a capital R (words which begin and end with an upper case 
letter); the presence of such words as ethyl, methyl, etc., or the presence of Greek letters 
in chemical names. 

For example, the indexing program selects terms on the basis of their length to
complement the dictionary method. Strings of characters exceeding 18 characters and not
contained in the dictionary are put out for visual inspection. Those terms which are judged
to be useful index terms are used in two ways:

1) they are added to the index record of the document which generated them 

2) they are used to update the dictionary. 
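
The length criterion can be sketched in Python as follows. The tokenization (splitting on whitespace and trimming end punctuation) and the name `new_term_candidates` are assumptions; the report states only that strings exceeding 18 characters and absent from the dictionary are put out for inspection.

```python
def new_term_candidates(text, dictionary_terms, max_known_length=18):
    """List strings longer than max_known_length that are not in the dictionary.

    The result is meant for visual inspection by a person, not for direct
    inclusion in the index record.
    """
    known = {term.lower() for term in dictionary_terms}
    seen = set()
    candidates = []
    for token in text.split():
        token = token.strip(".,;:\"'")  # trim punctuation at the ends only
        key = token.lower()
        if len(token) > max_known_length and key not in known and key not in seen:
            seen.add(key)
            candidates.append(token)
    return candidates
```

A human reviewer would then decide which candidates are added to the index record of the originating document and to the dictionary.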
III. SEARCHING THE MEDICO SYSTEM

The immediate output of the indexing program is the MEDICO index file. It was 
essential to design this file to be satisfactory for search purposes and at the same time to 
be capable of producing a printout with a format that is convenient to use. 

The MEDICO index file is a direct file of document records on magnetic tape arranged
in document accession number order. The principal characteristic which distinguishes this
file from many other document retrieval files is that its content and format are automatically
generated by the computer from natural language text.

The record for each document in the MEDICO index file consists of the following elements: 

author

title

bibliographic citation

total number of text words

index terms without modifiers, with their respective weights and Chemical Abstracts
Registry Numbers

index terms with modifiers, with their respective weights

The fact that each record shows all index terms which were assigned to the document is
useful because the terms taken together add up to a rudimentary abstract. While they are
not substitutes for good informative abstracts prepared by good human abstractors, they do
provide a certain degree of informativeness for the user.

Figure 1 shows a sample printout of an index record generated from full text, and Figure
2 shows the index record for the same document generated from reduced text.

Since the MEDICO file is a direct file, each record stands for a single document as
opposed to an inverted file in which each record stands for a single index term. Inherent in 
the process of searching a direct file for documents specified by subject is the need to make 
a complete scan of the entire file for each query to be processed. The capability for simultaneous 
searches, processing several queries in a single pass of the tape, can compensate for this 
limitation. 
Searching is essentially the reverse of indexing, and the preparation of a search
instruction involves procedures and sources of errors that are very similar to those encountered 
in indexing. The objective of searching is to identify those documents whose content is 
relevant to the query. 

The output of a search may be viewed as the result of the relevance judgment of the 
system. Theoretically, the closer this resembles the relevance judgment of the user the 
better the system performs. In practice, however, the problem is not quite as clear-cut, and
factors influencing both system and user judgment need to be taken into consideration. 

In addition to utilizing the primary access points in the index record, the MEDICO
search program is designed to make possible complex Boolean searches using the connectives
AND, OR, and NOT. Any data element can be combined with any other included in the subject
description part of the index record, which provides for many more possible
combinations than would be needed in practice.

While the Boolean expression corresponding to the query is formulated by the human 
searcher, normalization of the search terms to make the query compatible with the index 
record is accomplished automatically by the computer. The index file is searched sequentially 
to find terms as prescribed in the Boolean expression. Several queries can be searched 
simultaneously and the output relating to each query can be printed out as a separate unit. 
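
A sequential Boolean search of a direct file, with several queries handled in one pass, can be sketched in Python. The record layout, the nested-tuple query form, and the names `NORMALIZE`, `matches`, and `search` are assumptions for illustration. The two normalization entries rest on general pharmacological knowledge (Tegretol is a trade name of carbamazepine, and 3,5,5-trimethyloxazolidine-2,4-dione is trimethadione), not on the project's dictionary.

```python
# Hypothetical normalization table: query term (lower-cased) -> index term.
NORMALIZE = {
    "tegretol": "carbamazepine",
    "3,5,5-trimethyloxazolidine-2,4-dione": "trimethadione",
}


def normalize(term):
    """Map a query term onto the controlled vocabulary of the index file."""
    key = term.lower()
    return NORMALIZE.get(key, key)


def matches(query, terms):
    """Evaluate a query: a term string, or a tuple ("AND" | "OR" | "NOT", ...)."""
    if isinstance(query, str):
        return normalize(query) in terms
    connective, *operands = query
    if connective == "AND":
        return all(matches(q, terms) for q in operands)
    if connective == "OR":
        return any(matches(q, terms) for q in operands)
    if connective == "NOT":
        return not matches(operands[0], terms)
    raise ValueError(f"unknown connective: {connective}")


def search(index_file, queries):
    """Scan the file once and collect accession numbers separately per query."""
    hits = {name: [] for name in queries}
    for accession, terms in index_file:  # one sequential pass over the file
        for name, query in queries.items():
            if matches(query, terms):
                hits[name].append(accession)
    return hits
```

Because every record is examined once for all queries, the cost of the full scan is shared among the queries in the batch, and each query's references are kept as a separate unit.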

The following are examples of queries which were used in testing the system. 

What is the use of 3,5,5-trimethyloxazolidine-2,4-dione as an anticonvulsant?

What kind of toxic side effects can be expected when Tegretol is administered? 

What dose of trimethadione is used in the treatment of epileptics? 
IV. EVALUATION OF THE INDEXING ALGORITHM

A major part of the evaluation work was concerned with the statistical evaluation of 
the output of the MEDICO automatic indexing algorithm. The statistical tests were intended 
to examine the validity of the assumptions which were the bases of the indexing algorithm with 
primary emphasis on the methods which were developed for the computation of weights and the 
generation of links. Also included in the tests was a comparison between output generated
from the full text of the article and from the processing of the abstracts or summaries of
the same documents.

Evaluation of the output of the indexing involved essentially a comparison between the
judgment of the human indexer and that of the indexing algorithm to see the proportion of
agreement between the manual and the machine method. The documents were examined by
a human indexer to check whether the weights appearing in the output of the automatic
indexing program agreed with those assigned when intellectual judgment rather than an
automatic method was used. The links in the same output were checked to determine whether
they were correct in the context of the document.

In the evaluation of the weighting method, the automatic and manual methods were
said to agree when both assigned the same weight to a particular index term. A comparison of
the weights assigned to terms using the MEDICO and manual procedures gave the same value
for 71 percent of the terms generated by either procedure. A moderate increase in agreement
(78 percent) was observed when only terms having a weight of 3 by at least one of the methods
were considered. Ninety-eight percent of the weights assigned by the two methods were found
either to be in agreement or to differ by a weight of 1. These findings tend to suggest that
allowing only two weights in the system instead of three would perhaps increase the proportion
of agreement. However, more research is needed to examine the validity of this assumption.

In the evaluation of the linking procedure, the purpose of the test was to (1) determine 
the proportion of relevant links, (2) determine whether writing style has an effect on the 
proportion of relevant links, and (3) consider criteria other than co-occurrence within a
sentence in defining a link. 

Seventy-two percent of the links generated by a full text scan of the articles by the 
MEDICO procedure were relevant. While writing style does not appear to have an effect on 
the proportion of agreements on weights for the two methods, the percentage of relevant links 
observed was found to be dependent on the author's writing style. The proportion of relevant 
links decreased as the average length of the sentences increased. This suggests that it
may be desirable to change the definition of a link from co-occurrence within a sentence to
co-occurrence between two punctuation marks. A preliminary check of the 15 articles studied
showed that the proportion of relevant links would increase to 84 percent as a result of such a
redefinition. Further research is needed to study this assumption in detail.

Evaluation also included a comparison of the output generated from full text with the
output from abstracts or summaries of the same articles.

This type of comparison is important because there is reason to believe that the difference 
in cost between the two methods may be considerable. Data relative to the effectiveness of 
the two methods should give some indication of the degree of improvement that can be expected 
for the additional expense involved in full text processing. 

The purpose of the comparison between the two kinds of output was to evaluate the 
following questions: 

(1) Does an abstract provide a better index than a summary paragraph? 

(2) Do the terms which have a weight of 3 in the full text index appear more 
frequently in the reduced text index than terms which have a weight of 2 or 1 ? 

(3) Can we consider the link distance or some function of the average link distance 
in order to segregate relevant and irrelevant links? 

Since no statistically significant difference was found between the two forms of reduced text,
abstracts and summaries were considered together in the evaluation.

Because the average number of words in the reduced text was 127, it was impossible
to rank the importance of terms indexed by assigning weights. It must be assumed that any
term mentioned in the reduced text is important. This means that terms having a weight of
3 in the full text index should appear more frequently in the reduced text index than terms which have
a weight of 2 or 1.

A comparison of the index terms generated from full text with those which were generated 
from reduced text confirmed this hypothesis. The test showed that the proportion of terms 
indexed from reduced text is greater for those terms which had high weights in the full 
text analysis. Eighty-six percent of terms having a weight of 3, 46 percent of the terms having
a weight of 2, and 11 percent of the terms having a weight of 1 in the full text analysis were
also generated from reduced text.

The average relevant link distance in reduced text was 3.4 words and the average
irrelevant link distance was 8.9 words. Several of the irrelevant link distances (11, 18, 11, 14)
were quite large when compared to the lengths of the relevant links. This would indicate that a
significant reduction in the frequency of irrelevant links might occur by defining a link as
co-occurrence within a predetermined distance. For example, suppose we consider only links
of 10 words or less in distance. This criterion would yield 23 links of which 20 are relevant.
The upper limit of 10 was obtained by taking the average of the relevant link distances
(3.4) plus 3 standard deviations of the relevant link distances (standard deviation 2.16),
that is, 3.4 + 3(2.16) = 9.88, or approximately 10.
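
The derivation of the distance limit can be checked with a few lines of Python. The function names `distance_limit` and `within_limit` are assumptions for illustration; the mean, the standard deviation, and the four long irrelevant distances are the figures quoted above.

```python
MEAN_RELEVANT_DISTANCE = 3.4  # average relevant link distance, in words
SD_RELEVANT_DISTANCE = 2.16   # standard deviation of relevant link distances


def distance_limit(mean, sd, k=3):
    """Upper limit for a link: mean plus k standard deviations."""
    return mean + k * sd


def within_limit(distance, limit=10):
    """True if a link of this distance would still be generated."""
    return distance <= limit


# Mean plus three standard deviations, rounded to the nearest whole word.
limit = round(distance_limit(MEAN_RELEVANT_DISTANCE, SD_RELEVANT_DISTANCE))

# The four long irrelevant links cited in the text would all be rejected.
rejected = [d for d in (11, 18, 11, 14) if not within_limit(d, limit)]
```

With these figures the unrounded limit is 9.88 words, which rounds to 10, and all four of the long irrelevant links exceed it.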

It should be pointed out that this observation about link distance is based on a small
sample size (27). Previous data from the full text analysis indicate that the average of the
relevant link distances is dependent on the author's writing style. However, it might
happen that all authors write in a more concise form in a summary paragraph or an abstract.
Future investigations with larger sample sizes should shed more light on this observation.
V. EVALUATION OF THE OUTPUT OF THE SYSTEM

The evaluation of the output of the MEDICO index file was concerned with the system's
ability to respond to specific queries. Considerable limitations were placed on this phase 
of the evaluation work by the small size of the index file and by the limited scope of subject 
matter. Both of these factors made it necessary to use artificial questions which in turn 
introduced further limitations. The test results which are presented here should be interpreted 
in the context of the limitation just outlined and should not be considered as figures which can 
form the bases of broad generalizations. 

While for the reasons just explained the usefulness of the quantitative results given
in this section of the report is somewhat limited, an examination of the qualitative results
does provide some useful insights for future research.

Most of the precision failures were due to the inclusion in the article's index record
of terms which represent peripheral subjects, resulting in high exhaustivity in indexing.
This situation is to some extent inherent in the indexing method. However, it can be controlled
through the use of weights in indexing and through the proper utilization of weights in the
search formulation to achieve the optimum level of generality for a given query. In contrast
with these precision failures, recall failures were caused by faulty search strategies involving
search formulations which were too specific. One indexing problem that caused recall failures
was that a term was implied in the text but did not actually occur in it.

On the whole, search strategy emerged as a very important factor governing the 
performance of the system and the capability to allow for a flexible search strategy emerged 
as an important strength of the system. The ability to vary the search strategy through the 
use of links, weights, and search logic illustrates the flexibility of the system. It is possible 
to search using any version of a drug's name (chemical, generic, or trade name) because of 
the use of packages and the "normalization" of terms. Query 2, for example, causes five 
articles to be retrieved although only one of the sample articles in the file contained the 
chemical name used in the question. All articles mentioning that drug in any form of its 
name were retrieved. 

Questions involving the linking of terms are also automatically "normalized" by the 
system. Query 3 calls for articles on "barbiturates used in the treatment of convulsive 
disorders." Any article on barbiturates which linked anticonvulsants with the words treat, 
treated, treatment, therapy, therapeutic, or therapeutic effect can be retrieved. 

Because of the use of packages, requests for articles on a class of drugs (barbiturates, 
hydantoins, etc. ) will yield articles on any drug in the class, although the class may not be 
mentioned in the article. This capacity also enables the searcher to use a logical NOT 
strategy (such as in Query 1) and thus eliminate large numbers of non-relevant documents on 
retrieval. 

The measures used in the evaluation were the recall ratio and the precision ratio. 
While these ratios have been used extensively for the measurement of system performance, 
it should be remembered that their application requires considerable subjective judgment on 
the part of the evaluator to determine the relevancy of a document to a query. 

The recall ratio equals the number of relevant documents retrieved over the number 
of known relevant documents in the collection times 100. The precision ratio equals relevant 
documents retrieved over the total number of documents retrieved times 100. When the two 
measures are applied together they should indicate what proportion of the total number of 
relevant documents in the collection has been retrieved and at what cost, in terms of noise, 
a particular performance was achieved. 
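
The two definitions can be written out as a short sketch. The article identifiers in the example are hypothetical and do not correspond to any search reported here.

```python
def recall_ratio(retrieved: set[str], relevant: set[str]) -> float:
    """Relevant documents retrieved over known relevant documents, times 100."""
    return 100.0 * len(retrieved & relevant) / len(relevant)


def precision_ratio(retrieved: set[str], relevant: set[str]) -> float:
    """Relevant documents retrieved over all documents retrieved, times 100."""
    return 100.0 * len(retrieved & relevant) / len(retrieved)


# Hypothetical search: four known relevant articles, five articles retrieved.
relevant = {"A1", "A2", "A3", "A4"}
retrieved = {"A1", "A2", "A3", "B7", "B9"}

print(recall_ratio(retrieved, relevant))     # 75.0 (3 of 4 relevant articles found)
print(precision_ratio(retrieved, relevant))  # 60.0 (3 of 5 retrieved are relevant)
```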

Twelve queries were used to search the two MEDICO index files generated from the 
indexing of full text and of reduced text, respectively. The average ratios were calculated 
by averaging the appropriate individual ratios, and the relevance judgments were made by 
a single individual. 
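
Averaging the individual ratios gives every query equal weight, which can differ from a ratio computed on the pooled counts of all queries. The sketch below illustrates the difference; the two sets of counts are hypothetical and are not taken from the twelve test queries.

```python
from statistics import mean

# Hypothetical per-search counts: (relevant retrieved, total retrieved, known relevant).
searches = [
    (1, 1, 2),
    (9, 10, 10),
]

# Average of the individual ratios, as described in the text.
average_precision = mean(100.0 * hits / retrieved for hits, retrieved, _ in searches)
average_recall = mean(100.0 * hits / relevant for hits, _, relevant in searches)

# For contrast only: recall computed on the pooled counts of both searches.
pooled_recall = 100.0 * sum(s[0] for s in searches) / sum(s[2] for s in searches)

print(average_precision)        # 95.0
print(average_recall)           # 70.0
print(round(pooled_recall, 1))  # 83.3
```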

Index File Generated From Full Text Indexing 

The average recall and precision ratios which follow are based on a set of twelve 
queries which were searched in a file generated from the automatic indexing of the full text 
of 30 articles: 

    Over-all precision ratio    78 percent 
    Over-all recall ratio       76 percent 

Recall failures occurred in six and precision failures occurred in four of the twelve 
searches whose output was analyzed. There were eleven articles which were relevant 
but not retrieved. Out of the eleven unretrieved articles three were not retrieved because 
of inadequate indexing and eight were missed because of faulty search strategy. Indexing 
problems which caused recall failures were due to the following factors: 

— lack of occurrence of a specific term in the text, although implied; 

— lack of a link because of the non-occurrence of a term in a given sentence; and 

— the presence of an incorrect weight. 

Faulty search strategies generally involved search formulations which were too specific, 
demanding too many parameters to satisfy the query. 

Precision failures involved 21 articles. Out of these 21 non-relevant articles nine were 
retrieved because of inadequate indexing and twelve were retrieved because of faulty search 
strategy. 

Indexing problems causing precision failures were of the following kind: 

— mention of a term in text which is only peripheral to the subject of the article 
causing high exhaustivity in indexing; 

— the presence of an incorrect link; and 

— the level of generality of the terms involved. 

Faulty search strategies causing precision failures generally involved search 
formulations which were too broad, resulting in high recall. 

Recall Failures 

Cause of      Number of    Percent of      Number of    Percent of 
failure       missed       total missed    searches     total searches 
              articles     articles        involved 

Indexing          3            27              3            25 
Searching         8            73              3            25 
Total            11                            6            50 



Precision Failures 

Cause of      Number of    Percent of      Number of    Percent of 
failure       articles     total articles  searches     total number 
              involved                     involved     of searches 

Indexing          9            43              2            16.6 
Searching        12            57              2            16.6 
Total            21                            4            33.3 
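
The percentages in the two failure tables follow directly from the counts given in the text and can be rechecked with a short sketch. The helper name is hypothetical; all counts are those reported above.

```python
def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


TOTAL_SEARCHES = 12

# Recall failures: 11 missed articles, 3 due to indexing and 8 due to searching.
print(round(percent(3, 11)), round(percent(8, 11)))    # 27 73

# Precision failures: 21 non-relevant articles, 9 due to indexing and 12 due to searching.
print(round(percent(9, 21)), round(percent(12, 21)))   # 43 57

# Searches involved, as a share of the twelve searches.
print(percent(3, TOTAL_SEARCHES), percent(6, TOTAL_SEARCHES))  # 25.0 50.0
print(round(percent(2, TOTAL_SEARCHES), 1))  # 16.7 (the table truncates this to 16.6)
print(round(percent(4, TOTAL_SEARCHES), 1))  # 33.3
```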



Comparison of Full Text and Reduced Text 

The performance tests included a comparison between index files generated from the 
full text and from the reduced text of the same documents. The same twelve queries were 
searched in two smaller files corresponding to fifteen documents. The following figures 
were obtained: 

Summary of Average Recall and 
Precision Ratios for 12 Searches 

                            Index file        Index file 
                            from full text    from reduced text 
                            (percent)         (percent) 

Average precision ratio         81.8              50 
Average recall ratio            82.5              40.3 



Data Retrieval 

In addition to the twelve queries used in the evaluation tests, several queries were 
used to prove the system's ability to retrieve data as distinguished from the retrieval of 
documents. For example, the response to the query: 

"What is the chemical name and Chemical Abstracts registry number of the 
succinimides which are used as anticonvulsants?" is a list of records containing the chemical 
names and Chemical Abstracts registry numbers of drugs of the succinimide group. These 
names and numbers are not necessarily contained in the articles to which the various records 
refer, but the information is contained in the immediate output of the system. In this instance 
the system acts as a retrieval system, and there is no need to consult any documents 
to find the answer to a query. 

The same is true of the query, "What is the chemical name of Sulthiame?" because 
the chemical name will be contained in the index record of an article on Sulthiame, although 
the article itself may not include a mention of the chemical name. 

List of Queries 

Query 1 

Drugs which are active as anticonvulsants but which are not of the barbiturate, hydantoin, 
or succinimide family. 

Query 2 

The use of 3,5,5-trimethyloxazolidine-2,4-dione as an anticonvulsant. 

Query 3 

Barbiturates, with the exception of amobarbital, which are used in the treatment of 
convulsive disorders. 

Query 4 

What hydantoins, other than phethenylate, can be used in anticonvulsant therapy? 

Query 5 

What kind of toxic side effects can be expected when Tegretol is administered? 

Query 6 

The effectiveness of diphenylhydantoin as a therapeutic agent and the dosage that is 
recommended. 

Query 7 

The dosage of trimethadione used in the treatment of epileptics. 

Query 8 

The administration of diphenylhydantoin sodium in the treatment of trigeminal neuralgia. 

Query 9 

Articles which have as a central topic the use of Primidone as an anticonvulsant. 

(Weight 3 for both Primidone and anticonvulsant) 

Query 10 

Same as Query 9 with a weight of 2 for anticonvulsants. 

Query 11 

Articles dealing with oxazolidinediones. 

Query 12 

Articles on barbiturates. 



